Sample clustering

PCA

##               PC1        PC2       group condition replicate   name
## WT1    -15.446124   6.985017   Control:1   Control         1    WT1
## WT2    -10.845993  13.371109   Control:2   Control         2    WT2
## WT3    -14.324961   4.010518   Control:3   Control         3    WT3
## Ab1     -8.697130  -8.343581     Light:1     Light         1    Ab1
## Ab2     -9.401742  -8.836389     Light:2     Light         2    Ab2
## Ab3    -11.008402  -7.357857     Light:3     Light         3    Ab3
## Wg1     18.418470  -1.204799       Wnt:1       Wnt         1    Wg1
## Wg2      7.851189 -10.885567       Wnt:2       Wnt         2    Wg2
## Wg3     13.544888   1.181744       Wnt:3       Wnt         3    Wg3
## Wg_Ab1  15.524041   5.254569 Wnt_Light:1 Wnt_Light         1 Wg_Ab1
## Wg_Ab2  14.385764   5.825236 Wnt_Light:2 Wnt_Light         2 Wg_Ab2
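The table above has the shape of what DESeq2's `plotPCA(..., returnData = TRUE)` returns, although that this call produced it is an assumption on our part. As a rough illustration of how such per-sample coordinates arise, the sketch below runs base R's `prcomp` on a small invented matrix of log-scale expression values. The matrix `toy` and its sample names are hypothetical and are not taken from this analysis.

```r
# Toy log-scale expression matrix: 6 genes (rows) x 4 samples (columns).
# All values are invented for illustration only.
toy <- matrix(
  c(5.0, 5.2, 9.0, 9.1,
    7.0, 6.9, 3.0, 3.2,
    4.0, 4.1, 4.0, 3.9,
    6.0, 6.1, 8.0, 7.9,
    2.0, 2.2, 6.0, 6.1,
    8.0, 7.9, 8.1, 8.0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene", 1:6), c("A1", "A2", "B1", "B2"))
)

# PCA treats samples as observations, so transpose to samples x genes.
pca <- prcomp(t(toy), center = TRUE, scale. = FALSE)

# Fraction of total variance explained by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Per-sample coordinates on the first two components.
pca_df <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], name = colnames(toy))
print(pca_df)
```

In this toy matrix the two groups differ far more than the replicates within a group, so PC1 separates A from B and carries most of the variance.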

Number of genes detected

rRNA percentage

Proportions of gene biotypes

Remove rRNA from the analysis

# Drop rRNA, snoRNA and snRNA genes, then genes with zero counts in all samples
remove.ids = ensembl.genes$gene_id[ensembl.genes$gene_biotype %in% c("rRNA", "snoRNA", "snRNA")]
dds = dds[!(rownames(dds) %in% remove.ids),]
dds = dds[rowSums(counts(dds)) > 0,]

Check rRNA removed

Check chrM intact

## 
## chr2L chr2R chr3L chr3R  chr4  chrX  chrY  chrM 
##  2634  2929  2698  3362    80  2157    41    37

Clustering after rRNA removal

PCA after rRNA removal

##               PC1        PC2       group condition replicate   name
## WT1    -15.446207   6.976894   Control:1   Control         1    WT1
## WT2    -10.803245  13.412923   Control:2   Control         2    WT2
## WT3    -14.326604   4.023965   Control:3   Control         3    WT3
## Ab1     -8.722458  -8.353524     Light:1     Light         1    Ab1
## Ab2     -9.403147  -8.828752     Light:2     Light         2    Ab2
## Ab3    -10.999116  -7.357691     Light:3     Light         3    Ab3
## Wg1     18.460587  -1.162145       Wnt:1       Wnt         1    Wg1
## Wg2      7.809082 -10.899188       Wnt:2       Wnt         2    Wg2
## Wg3     13.555007   1.189389       Wnt:3       Wnt         3    Wg3
## Wg_Ab1  15.495600   5.197507 Wnt_Light:1 Wnt_Light         1 Wg_Ab1
## Wg_Ab2  14.380501   5.800622 Wnt_Light:2 Wnt_Light         2 Wg_Ab2

Size Factors

##       WT1       WT2       WT3       Ab1       Ab2       Ab3       Wg1       Wg2 
## 1.0823335 0.8944507 0.8968479 1.5526588 0.9047126 1.2777171 0.9656761 0.9258778 
##       Wg3    Wg_Ab1    Wg_Ab2 
## 1.0729766 0.8827814 0.7883241
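Size factors like these are estimated by DESeq2 with the median-of-ratios method by default. The sketch below reproduces the idea in base R on a small invented count matrix. `counts_toy` and its values are hypothetical, and the code is a simplified illustration rather than the package's implementation.

```r
# Toy count matrix: 4 genes x 3 samples (hypothetical values).
# Sample S2 is exactly 2x S1 and S3 is exactly 0.5x S1.
counts_toy <- matrix(
  c( 100,  200,  50,
      40,   80,  20,
      10,   20,   5,
    1000, 2000, 500),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("S1", "S2", "S3"))
)

# Per-gene log geometric mean across samples.
log_geo_means <- rowMeans(log(counts_toy))

# Genes with a zero count have a log geometric mean of -Inf and are skipped.
usable <- is.finite(log_geo_means)

# Size factor of a sample: median over genes of count / geometric mean.
size_factors <- apply(counts_toy, 2, function(sample_counts) {
  exp(median((log(sample_counts) - log_geo_means)[usable]))
})
print(size_factors)

# Dividing each column by its size factor gives normalised counts.
norm_counts <- sweep(counts_toy, 2, size_factors, "/")
```

For this toy matrix the size factors come out close to 1, 2 and 0.5, and the three normalised columns become identical. This is the behaviour one would hope for when samples differ only in sequencing depth.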

MA Plots
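An MA plot shows, for each gene, the log ratio between two samples (M) against their average log expression (A). The sketch below computes both quantities for two invented replicates. The vectors `rep1` and `rep2` and the pseudocount of 1 are assumptions made for illustration, not values or settings from this analysis.

```r
# Hypothetical normalised counts for two replicates of one condition.
rep1 <- c(geneA = 100, geneB = 400, geneC = 16, geneD = 64)
rep2 <- c(geneA = 100, geneB = 100, geneC = 64, geneD = 64)

# A pseudocount of 1 keeps the logarithm defined when a count is zero.
log_rep1 <- log2(rep1 + 1)
log_rep2 <- log2(rep2 + 1)

# M: log2 ratio between the two samples; A: average log2 expression.
M <- log_rep1 - log_rep2
A <- (log_rep1 + log_rep2) / 2

ma <- data.frame(A = A, M = M)
print(ma)

# Scatter plot with a dashed reference line at M = 0 (no difference).
plot(ma$A, ma$M, xlab = "A (mean log2 count)", ylab = "M (log2 ratio)", pch = 19)
abline(h = 0, lty = 2)
```

For well-behaved replicates, most points should scatter around M = 0 across the whole range of A.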


Normalised counts (boxplot)

The genes with greater than 5^{5} normalised counts are:

             seqnames  start     end       width  strand  gene_id      gene_biotype    entrezgene_id  external_gene_name
FBgn0000079  chr2R     17118636  17120303  1668           FBgn0000079  protein_coding  47764          Amy-p
FBgn0003356  chr3R     29923754  29924615  862            FBgn0003356  protein_coding  43544          Jon99Cii
FBgn0003357  chr3R     29922216  29923455  1240           FBgn0003357  protein_coding  43543          Jon99Ciii
FBgn0003863  chr2R     11344260  11345119  860            FBgn0003863  protein_coding  48316          alphaTry
FBgn0013674  chrM      1474      3009      1536           FBgn0013674  protein_coding  192469         mt:CoI
FBgn0033774  chr2R     12762113  12763825  1713           FBgn0033774  protein_coding  36410          CG12374
FBgn0035665  chr3L     6050757   6051690   934            FBgn0035665  protein_coding  38683          Jon65Aiii
FBgn0036024  chr3L     9641013   9641913   901            FBgn0036024  protein_coding  39125          CG18180
FBgn0040060  chr3L     6035216   6036175   960            FBgn0040060  protein_coding  38680          yip7
FBgn0250815  chr3L     6039155   6040135   981            FBgn0250815  protein_coding  38682          Jon65Aiv

All of these are annotated as protein-coding genes, so we will not remove them.

Normalised counts (transcripts per million, TPM)

The genes with greater than 2^{4} TPM are:

             seqnames  start     end       width  strand  gene_id      gene_biotype    entrezgene_id  external_gene_name
FBgn0002868  chr3R     9783407   9784370   964            FBgn0002868  protein_coding  41202          MtnA
FBgn0003356  chr3R     29923754  29924615  862            FBgn0003356  protein_coding  43544          Jon99Cii
FBgn0003863  chr2R     11344260  11345119  860            FBgn0003863  protein_coding  48316          alphaTry
FBgn0004426  chr3L     1210485   1210726   242            FBgn0004426  pseudogene      38126          LysC
FBgn0036024  chr3L     9641013   9641913   901            FBgn0036024  protein_coding  39125          CG18180
FBgn0040060  chr3L     6035216   6036175   960            FBgn0040060  protein_coding  38680          yip7
FBgn0040687  chr3R     4335098   4335506   409            FBgn0040687  protein_coding  50160          CG14645
FBgn0066084  chr2R     24903416  24903935  520            FBgn0066084  protein_coding  251466         RpL41
FBgn0250815  chr3L     6039155   6040135   981            FBgn0250815  protein_coding  38682          Jon65Aiv

One of these genes, LysC (FBgn0004426), is annotated as a pseudogene yet has a high TPM, so we will remove it from the final dataset.
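For reference, TPM divides each gene's count by its length in kilobases and then rescales every sample so that the values sum to one million. The sketch below applies this to a small invented matrix, then flags high-TPM genes and drops pseudogenes. The counts, gene lengths, biotypes and the threshold are all hypothetical, and using a single fixed length per gene is a simplifying assumption.

```r
# Hypothetical raw counts (3 genes x 2 samples), gene lengths and biotypes.
counts_toy <- matrix(
  c(100, 200,
    300, 300,
     50,  10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("S1", "S2"))
)
gene_length_bp <- c(geneA = 1000, geneB = 3000, geneC = 500)
biotype <- c(geneA = "protein_coding", geneB = "protein_coding",
             geneC = "pseudogene")

# Reads per kilobase: divide each gene's counts by its length in kb.
# The length vector has one entry per row, so it recycles down each column.
rpk <- counts_toy / (gene_length_bp / 1000)

# TPM: rescale each sample so that its RPK values sum to one million.
tpm <- sweep(rpk, 2, colSums(rpk), "/") * 1e6

# Flag genes whose TPM exceeds a threshold in any sample.
# The threshold is arbitrary and chosen only to suit the toy data.
high_tpm <- rownames(tpm)[apply(tpm > 4e5, 1, any)]
print(high_tpm)

# Drop pseudogenes from the matrix.
tpm_filtered <- tpm[biotype[rownames(tpm)] != "pseudogene", , drop = FALSE]
print(tpm_filtered)
```

Note that once genes are removed, the remaining TPM values in a sample no longer sum to one million unless they are rescaled.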

Session Info

## R version 4.0.5 (2021-03-31)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] kableExtra_1.3.4            Biostrings_2.58.0          
##  [3] XVector_0.30.0              scales_1.1.1               
##  [5] reshape2_1.4.4              knitr_1.31                 
##  [7] biomaRt_2.46.3              GenomicFeatures_1.42.3     
##  [9] AnnotationDbi_1.52.0        genefilter_1.72.1          
## [11] ggplot2_3.3.3               DESeq2_1.30.1              
## [13] SummarizedExperiment_1.20.0 Biobase_2.50.0             
## [15] MatrixGenerics_1.2.1        matrixStats_0.58.0         
## [17] GenomicRanges_1.42.0        GenomeInfoDb_1.26.7        
## [19] IRanges_2.24.1              S4Vectors_0.28.1           
## [21] BiocGenerics_0.36.0        
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-152             bitops_1.0-6             bit64_4.0.5             
##  [4] webshot_0.5.2            RColorBrewer_1.1-2       progress_1.2.2          
##  [7] httr_1.4.2               rprojroot_2.0.2          tools_4.0.5             
## [10] bslib_0.2.4              utf8_1.2.1               R6_2.5.0                
## [13] mgcv_1.8-34              DBI_1.1.1                colorspace_2.0-0        
## [16] withr_2.4.1              tidyselect_1.1.0         prettyunits_1.1.1       
## [19] bit_4.0.4                curl_4.3                 compiler_4.0.5          
## [22] rvest_1.0.0              xml2_1.3.2               DelayedArray_0.16.3     
## [25] labeling_0.4.2           rtracklayer_1.50.0       sass_0.3.1              
## [28] askpass_1.1              rappdirs_0.3.3           systemfonts_1.0.1       
## [31] stringr_1.4.0            digest_0.6.27            Rsamtools_2.6.0         
## [34] svglite_2.0.0            rmarkdown_2.7            pkgconfig_2.0.3         
## [37] htmltools_0.5.1.1        highr_0.8                dbplyr_2.1.1            
## [40] fastmap_1.1.0            rlang_0.4.10             rstudioapi_0.13         
## [43] RSQLite_2.2.6            farver_2.1.0             jquerylib_0.1.3         
## [46] generics_0.1.0           jsonlite_1.7.2           BiocParallel_1.24.1     
## [49] dplyr_1.0.5              RCurl_1.98-1.3           magrittr_2.0.1          
## [52] GenomeInfoDbData_1.2.4   Matrix_1.3-2             Rcpp_1.0.6              
## [55] munsell_0.5.0            fansi_0.4.2              lifecycle_1.0.0         
## [58] stringi_1.5.3            yaml_2.2.1               zlibbioc_1.36.0         
## [61] plyr_1.8.6               BiocFileCache_1.14.0     grid_4.0.5              
## [64] blob_1.2.1               crayon_1.4.1             lattice_0.20-41         
## [67] splines_4.0.5            annotate_1.68.0          hms_1.0.0               
## [70] locfit_1.5-9.4           pillar_1.6.0             geneplotter_1.68.0      
## [73] XML_3.99-0.6             glue_1.4.2               evaluate_0.14           
## [76] vctrs_0.3.7              gtable_0.3.0             openssl_1.4.3           
## [79] purrr_0.3.4              assertthat_0.2.1         cachem_1.0.4            
## [82] xfun_0.22                xtable_1.8-4             viridisLite_0.4.0       
## [85] survival_3.2-10          tibble_3.1.0             GenomicAlignments_1.26.0
## [88] memoise_2.0.0            ellipsis_0.3.1